
Conversation


@mredenti mredenti commented Oct 6, 2025

This PR introduces SQLAlchemy Core to abstract database interactions in ReFrame's storage backend, adding support for PostgreSQL alongside SQLite. It serves as a proof of concept to gather feedback on this approach before adding comprehensive tests and documentation. I have manually tested PostgreSQL with psycopg2 for schema creation, storage, and CLI output. Full PostgreSQL tests in test_storage.py will follow approval.

Motivation

Supporting multiple SQL dialects (SQLite, PostgreSQL, MySQL) is complex due to differences in:

  • Connection APIs (e.g., PostgreSQL needs drivers like psycopg2 or pg8000)
  • Schema syntax (e.g., SQLite's AUTOINCREMENT vs. PostgreSQL's SERIAL)
  • Transaction behaviors and foreign key handling
  • Risk of UUID collisions in concurrent sessions with application-generated IDs

SQLAlchemy Core unifies these interactions behind a single expression language, improving maintainability and extensibility, as the sketch below illustrates.
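For concreteness, here is a minimal sketch of the idea (illustrative only, not code from this PR; the table and URL are made up): a single Core expression is compiled to dialect-specific SQL by whichever engine executes it, so the storage layer never embeds SQLite- or PostgreSQL-specific strings.

```python
# Minimal sketch: one Core expression, multiple dialects.
import sqlalchemy as sa

metadata = sa.MetaData()
sessions = sa.Table(
    'sessions', metadata,
    sa.Column('uuid', sa.Text, primary_key=True),
    sa.Column('session_start_unix', sa.Float)
)

# The same expression object works for any dialect...
stmt = sa.select(sessions).where(sessions.c.session_start_unix >= 0)

# ...because each engine compiles it to its own SQL:
engine = sa.create_engine('sqlite:///:memory:')  # or a postgresql+psycopg2:// URL
metadata.create_all(engine)
with engine.begin() as conn:
    rows = conn.execute(stmt).fetchall()
```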

Next Steps (pending feedback)

  • Add PostgreSQL tests to test_storage.py
  • Update documentation for PostgreSQL configuration
  • Implement database-generated IDs for session UUIDs
  • Support report caching for retry on storage failures

Note: Users can replace psycopg2 with their preferred PostgreSQL driver (e.g., pg8000) by updating requirements.txt and postgresql_driver in the system configuration. Suggestions on test cases or PostgreSQL edge cases are welcome.

…e-only behaviour and schema layout.

- Introduce _ConnectionStrategy class and _SqliteConnector subclass to
  encapsulate URL building, engine creation, and per-dialect kwargs.
- Build schema with SQLAlchemy Core (MetaData, Table, Column), preserving table/column names and the index_testcases_time index.
- Convert INSERT, SELECT, DELETE queries to Core expressions.
- Enable SQLite foreign keys via PRAGMA foreign_keys=ON on each connection
  (matches previous cascade semantics).
- Replace raw sqlite3 connection/locking with SQLAlchemy transactional
  contexts (engine.begin()).
- Drop legacy file lock code and direct sqlite3.connect() calls.
- Remove uuids_sql raw string building in remove paths; use SQLAlchemy's
  in_() instead. (The first sketch after this list shows the PRAGMA hook
  and the in_()-based delete.)
- Introduce _PostgresConnector and wire backend='postgresql' in
  StorageBackend.create().
- Read connection parameters from the site config (storage/0/postgresql_driver,
  postgresql_host, postgresql_port, postgresql_db, postgresql_conn_timeout) and
  credentials from the environment (RFM_POSTGRES_USER / RFM_POSTGRES_PASSWORD);
  see the second sketch after this list.
- Pass connect_timeout via DBAPI connect args.
- Extend config schema (schemas/config.json) with the new Postgres options
  and add sensible defaults (driver=psycopg2, conn_timeout=60).
- Keep SQLite as-is; no behavior change for existing users.
- Add psycopg2-binary==2.9.8 to requirements.txt to provide the PG driver.
- Add _ConnectionStrategy.json_column_type property (default Text, overridden
  to JSONB for Postgres).
- Use this type in schema so Postgres stores sessions.json_blob as JSONB.
- Normalize JSONB values in _decode_sessions() by serializing non-strings to
  JSON strings for consistent downstream parsing.
- CLI: display backend info depending on storage backend
  (SQLite file path vs PostgreSQL host:port/db).
- Write the report as encoded JSON when using PostgreSQL.
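As referenced in the list above, a minimal sketch of the two SQLite-specific points, the foreign-key PRAGMA and the in_()-based delete, assuming an `engine` and the `sessions` table from the Core schema (exact ReFrame names may differ):

```python
import sqlalchemy as sa
from sqlalchemy import event

# SQLite disables foreign keys by default; enable them on every new
# DBAPI connection so ON DELETE CASCADE keeps working as before.
@event.listens_for(engine, 'connect')
def _enable_sqlite_fks(dbapi_conn, connection_record):
    cursor = dbapi_conn.cursor()
    cursor.execute('PRAGMA foreign_keys=ON')
    cursor.close()

# Delete sessions by UUID with in_() instead of hand-built SQL strings
def remove_sessions(uuids):
    with engine.begin() as conn:  # transactional context: commit/rollback handled
        conn.execute(sessions.delete().where(sessions.c.uuid.in_(uuids)))
```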
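And a hypothetical sketch of how the Postgres connector could assemble its URL from the config options and environment variables listed above; the option names follow this commit, while the config accessor and class internals are assumptions:

```python
import os
from sqlalchemy.engine import URL

class _PostgresConnector(_ConnectionStrategy):
    def __init__(self, site_config):
        # Option names as in schemas/config.json; `get()` is illustrative
        self._driver = site_config.get('storage/0/postgresql_driver')    # default: psycopg2
        self._host = site_config.get('storage/0/postgresql_host')
        self._port = site_config.get('storage/0/postgresql_port')
        self._db = site_config.get('storage/0/postgresql_db')
        self._timeout = site_config.get('storage/0/postgresql_conn_timeout')  # default: 60

    def _build_connection_url(self):
        # Credentials come from the environment, not the config file
        return URL.create(
            drivername=f'postgresql+{self._driver}',
            username=os.environ['RFM_POSTGRES_USER'],
            password=os.environ['RFM_POSTGRES_PASSWORD'],
            host=self._host, port=self._port, database=self._db
        ).render_as_string(hide_password=False)

    @property
    def _connection_kwargs(self):
        # Forwarded as DBAPI connect args (create_engine(connect_args=...))
        return {'connect_timeout': self._timeout}
```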

codecov bot commented Oct 6, 2025

Codecov Report

❌ Patch coverage is 78.51240% with 26 lines in your changes missing coverage. Please review.
✅ Project coverage is 69.04%. Comparing base (ce5a2e3) to head (c11bf3f).
⚠️ Report is 5 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| reframe/frontend/reporting/storage.py | 81.25% | 21 Missing ⚠️ |
| reframe/frontend/cli.py | 44.44% | 5 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (ce5a2e3) and HEAD (c11bf3f).

HEAD has 13 uploads less than BASE:

| Flag | BASE (ce5a2e3) | HEAD (c11bf3f) |
|---|---|---|
|  | 23 | 10 |
Additional details and impacted files
```diff
@@             Coverage Diff              @@
##           develop    #3567       +/-   ##
============================================
- Coverage    91.22%   69.04%   -22.18%
============================================
  Files           62       60        -2
  Lines        13525    13483       -42
============================================
- Hits         12338     9310     -3028
- Misses        1187     4173     +2986
```

☔ View full report in Codecov by Sentry.

@mredenti mredenti changed the title Feat/storage sqlalchemy [feat] Addind support for PostgreSQL storage via SQLAlchemy Oct 7, 2025
@mredenti mredenti changed the title [feat] Addind support for PostgreSQL storage via SQLAlchemy [feat] Adding support for PostgreSQL storage via SQLAlchemy Oct 7, 2025
@vkarak vkarak self-requested a review October 7, 2025 22:16
@vkarak vkarak added the request for enhancement, reporting (Issues related to reporting and processing the test results) and prio: normal labels Oct 7, 2025
@vkarak vkarak moved this to Todo in ReFrame Backlog Oct 7, 2025

@vkarak vkarak left a comment


I had a quick look over the key idea and I generally agree with the abstraction. I still have some questions, though, that I'd like to have clarified and tested before moving forward, because we want to ensure that the current functionality and performance are not compromised (check my comments in the code).

Also, how easy would it be with this design to add another backend, such as DuckDB?

```python
        # Update DB file mode
        os.chmod(self.__db_file, self.__db_file_mode)

    def _db_create_indexes(self):
```
vkarak (Contributor) commented:

Where are the indexes created in the new design? These are crucial for the performance of time-based queries.

mredenti (Author) replied:

  • During the _db_schema() call, the full database schema (tables, columns, foreign keys and indexes) is declared:
```python
def _db_schema(self):
    self.__sessions_table = Table(
        "sessions", self.__metadata,
        Column("uuid", Text, primary_key=True),
        Column("session_start_unix", Float),
        ...,
        Index("index_sessions_time", "session_start_unix"),
    )

    self.__testcases_table = Table(
        "testcases", self.__metadata,
        Column("job_completion_time_unix", Float),
        ...,
        Index("index_testcases_time", "job_completion_time_unix"),
    )
```

The schema definitions are then translated into SQL statements and executed against the database when _db_create() calls:

```python
self.__metadata.create_all(self.__connector.engine)
```

create_all() is idempotent: it creates any missing tables and indexes without altering existing ones.

  • As a quick check, I have inspected the SQLite database generated when running

```
./test_reframe.py -- unittests/test_reporting.py
```

and this shows that the indexes are there:

```console
sqlite3 /tmp/pytest-of-mredenti/pytest-0/test_storage_api0/.reframe/reports/results.db
SQLite version 3.45.1 2024-01-30 16:01:20
Enter ".help" for usage hints.
sqlite> .schema
CREATE TABLE sessions (
        uuid TEXT NOT NULL, 
        session_start_unix FLOAT, 
        session_end_unix FLOAT, 
        json_blob TEXT, 
        report_file TEXT, 
        PRIMARY KEY (uuid)
);
CREATE INDEX index_sessions_time ON sessions (session_start_unix);
CREATE TABLE metadata (
        schema_version TEXT
);
CREATE TABLE testcases (
        name TEXT, 
        system TEXT, 
        partition TEXT, 
        environ TEXT, 
        job_completion_time_unix FLOAT, 
        session_uuid TEXT, 
        uuid TEXT, 
        FOREIGN KEY(session_uuid) REFERENCES sessions (uuid) ON DELETE CASCADE
);
CREATE INDEX index_testcases_time ON testcases (job_completion_time_unix);
```

Of course, it is perhaps worth adding a test that queries the database and verifies that the schema matches the intended one, including that the correct indexes have been created. A rough sketch follows.
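Such a test could use SQLAlchemy's inspection API; a rough sketch, with the `engine` fixture assumed and not part of this PR:

```python
import sqlalchemy as sa

def test_schema_matches_intended(engine):
    insp = sa.inspect(engine)
    # All expected tables exist
    assert {'sessions', 'testcases', 'metadata'} <= set(insp.get_table_names())

    # Time-based indexes exist on both tables
    session_indexes = {ix['name'] for ix in insp.get_indexes('sessions')}
    assert 'index_sessions_time' in session_indexes

    testcase_indexes = {ix['name'] for ix in insp.get_indexes('testcases')}
    assert 'index_testcases_time' in testcase_indexes
```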

Comment on lines -133 to -143
```python
@contextlib.contextmanager
def _db_read(self, *args, **kwargs):
    with self.__db_lock.read_lock():
        with self._db_connect(*args, **kwargs) as conn:
            yield conn

@contextlib.contextmanager
def _db_write(self, *args, **kwargs):
    with self.__db_lock.write_lock():
        with self._db_connect(*args, **kwargs) as conn:
            yield conn
```
vkarak (Contributor) commented:

How can we guarantee a similar locking mechanism in the new design? These locks are needed to avoid "database is locked" errors when multiple ReFrame instances update the DB using the SQLite backend. More generally, the redesign should allow extending the underlying backend capabilities with extra functionality, as here with explicit read/write locking.
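One possible shape for this, sketched under the assumption that the file lock is kept as a SQLite-specific detail behind the connector interface; the hook name and lock attribute are illustrative, not from the PR:

```python
import contextlib

class _SqliteConnector(_ConnectionStrategy):
    @contextlib.contextmanager
    def write_context(self, engine):
        # Serialize writers across ReFrame instances with the existing
        # file lock to avoid 'database is locked' errors.
        with self._db_lock.write_lock():
            with engine.begin() as conn:
                yield conn

class _PostgresConnector(_ConnectionStrategy):
    @contextlib.contextmanager
    def write_context(self, engine):
        # Server-side backends rely on the database's own concurrency
        # control; no extra locking needed.
        with engine.begin() as conn:
            yield conn
```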


vkarak commented Oct 11, 2025

Also, regarding this:

Implement database-generated IDs for session UUIDs

I don't believe this is necessary for two reasons:

  1. The probability of collision for v4 UUIDs, such as those we generate, is extremely low.
  2. The DB feature is optional, but UUIDs are still needed and generated in the report. We also want to be able to generate a DB from existing JSON reports.

@vkarak vkarak marked this pull request as draft October 11, 2025 21:00

mredenti commented Oct 12, 2025

Also, how easy would it be with this design to add another backend, such as DuckDB?

With this refactor, introducing a new SQL-based backend (e.g. DuckDB) requires implementing a new connector subclass:

```python
class _DuckConnector(_ConnectionStrategy):
    def __init__(self):
        ...

    def _build_connection_url(self):
        return URL.create(
            drivername="duckdb",
            database="/path/to/results.duckdb"
        ).render_as_string()
```

The idea is indeed to be able to add a new backend in an isolated manner, independently of the storage and SQL logic. The _ConnectionStrategy base class defines the abstract interface for all SQL dialects: each connector only needs to provide the SQLAlchemy connection URL, optionally _connection_kwargs (e.g. connection timeout, driver-specific options), and a json_column_type property declaring the JSON blob type for that dialect (for instance, PostgreSQL supports TEXT, JSON and JSONB).
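A sketch of that json_column_type hook, following the description above (base default Text, Postgres override JSONB); the surrounding class bodies are elided:

```python
from sqlalchemy import Text
from sqlalchemy.dialects.postgresql import JSONB

class _ConnectionStrategy:
    @property
    def json_column_type(self):
        return Text    # portable default for the JSON blob column

class _PostgresConnector(_ConnectionStrategy):
    @property
    def json_column_type(self):
        return JSONB   # native binary JSON type on PostgreSQL
```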

Now, once a new dialect has been introduced, the only additional change required in storage.py is a new branch in the StorageBackend.create() factory method:

```python
elif backend == 'duckdb':
    return _SqlStorage(_DuckConnector(), *args, **kwargs)
```

Commit e162185 shows this for PostgreSQL.

Always serialize the report object to JSON text using `jsonext.dumps()`
before inserting into the database. This prevents `TypeError: Object of
type RunReport is not JSON serializable` errors when using the PostgreSQL
backend (JSONB column). For SQLite, we continue storing the JSON string.
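Illustrative write and read paths for this behaviour, assuming the sessions table from the schema and ReFrame's jsonext module; the helper names here are made up:

```python
import json
import reframe.utility.jsonext as jsonext

def store_report(conn, sessions_table, session_uuid, report):
    # Always insert a JSON *string*: works for both the SQLite TEXT
    # column and the PostgreSQL JSONB column.
    conn.execute(sessions_table.insert().values(
        uuid=session_uuid, json_blob=jsonext.dumps(report)
    ))

def normalize_blob(value):
    # JSONB values may come back as dicts/lists while TEXT comes back
    # as str; re-serialize non-strings so downstream parsing always
    # sees a JSON string (mirrors the _decode_sessions() normalization).
    return value if isinstance(value, str) else json.dumps(value)
```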
